Results 1 - 20 of 78
1.
Genome Res ; 34(1): 119-133, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38190633

ABSTRACT

Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space by using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal data sets, we show scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome data set we generated from differentiating mouse embryonic stem cells over time, we show scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.


Subjects
Gene Expression Profiling , Single-Cell Analysis , Animals , Mice , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Gene Expression Regulation
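
The scTIE abstract above mentions embedding cells from different time points with iterative optimal transport. As a rough illustration of what an optimal transport coupling between two time points looks like, here is a minimal sketch of generic entropy-regularised optimal transport (Sinkhorn iterations). It is not the scTIE implementation; the function name `sinkhorn`, the toy 1-D "embeddings", and the regularisation strength `eps` are assumptions made purely for illustration.

```python
import math


def sinkhorn(cost, a, b, eps=0.5, n_iter=500):
    """Generic entropy-regularised optimal transport (illustrative only).

    cost: n x m list of pairwise costs between early and late cells
    a, b: marginal weights for early and late cells (each sums to 1)
    Returns an n x m transport plan whose row sums approach a and
    whose column sums equal b.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical 1-D embeddings of cells at an early and a late time point.
early = [0.0, 1.0]
late = [0.1, 0.9, 2.0]
cost = [[(x - y) ** 2 for y in late] for x in early]
plan = sinkhorn(cost, [0.5, 0.5], [1 / 3, 1 / 3, 1 / 3])
```

Each entry `plan[i][j]` can be read as the mass moved from early cell `i` to late cell `j`; normalising a row gives a crude transition probability for that cell.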
2.
bioRxiv ; 2023 May 22.
Article in English | MEDLINE | ID: mdl-37292801

ABSTRACT

Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal datasets, we demonstrate scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome dataset we generated from differentiating mouse embryonic stem cells over time, we demonstrate scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.

3.
Nature ; 618(7964): 383-393, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37258665

ABSTRACT

The earliest events during human tumour initiation, although poorly characterized, may hold clues to malignancy detection and prevention. Here we model occult preneoplasia by biallelic inactivation of TP53, a common early event in gastric cancer, in human gastric organoids. Causal relationships between this initiating genetic lesion and resulting phenotypes were established using experimental evolution in multiple clonally derived cultures over 2 years. TP53 loss elicited progressive aneuploidy, including copy number alterations and structural variants prevalent in gastric cancers, with evident preferred orders. Longitudinal single-cell sequencing of TP53-deficient gastric organoids similarly indicates progression towards malignant transcriptional programmes. Moreover, high-throughput lineage tracing with expressed cellular barcodes demonstrates reproducible dynamics whereby initially rare subclones with shared transcriptional programmes repeatedly attain clonal dominance. This powerful platform for experimental evolution exposes stringent selection, clonal interference and a marked degree of phenotypic convergence in premalignant epithelial organoids. These data imply predictability in the earliest stages of tumorigenesis and show evolutionary constraints and barriers to malignant transformation, with implications for earlier detection and interception of aggressive, genome-instable tumours.


Subjects
Cell Transformation, Neoplastic , Clonal Evolution , Precancerous Conditions , Selection, Genetic , Stomach Neoplasms , Humans , Cell Transformation, Neoplastic/genetics , Cell Transformation, Neoplastic/pathology , Clonal Evolution/genetics , Genomic Instability , Mutation , Stomach Neoplasms/genetics , Stomach Neoplasms/pathology , Precancerous Conditions/genetics , Precancerous Conditions/pathology , Organoids/metabolism , Organoids/pathology , Aneuploidy , DNA Copy Number Variations , Single-Cell Analysis , Tumor Suppressor Protein p53/deficiency , Tumor Suppressor Protein p53/genetics , Disease Progression , Cell Lineage
4.
Proc Natl Acad Sci U S A ; 120(15): e2216698120, 2023 04 11.
Article in English | MEDLINE | ID: mdl-37023129

ABSTRACT

Discovering DNA regulatory sequence motifs and their relative positions is vital to understanding the mechanisms of gene expression regulation. Although deep convolutional neural networks (CNNs) have achieved great success in predicting cis-regulatory elements, the discovery of motifs and their combinatorial patterns from these CNN models has remained difficult. We show that the main difficulty is due to the problem of multifaceted neurons which respond to multiple types of sequence patterns. Since existing interpretation methods were mainly designed to visualize the class of sequences that can activate the neuron, the resulting visualization will correspond to a mixture of patterns. Such a mixture is usually difficult to interpret without resolving the mixed patterns. We propose the NeuronMotif algorithm to interpret such neurons. Given any convolutional neuron (CN) in the network, NeuronMotif first generates a large sample of sequences capable of activating the CN, which typically consists of a mixture of patterns. Then, the sequences are "demixed" in a layer-wise manner by backward clustering of the feature maps of the involved convolutional layers. NeuronMotif can output the sequence motifs, and the syntax rules governing their combinations are depicted by position weight matrices organized in tree structures. Compared to existing methods, the motifs found by NeuronMotif have more matches to known motifs in the JASPAR database. The higher-order patterns uncovered for deep CNs are supported by the literature and ATAC-seq footprinting. Overall, NeuronMotif enables the deciphering of cis-regulatory codes from deep CNs and enhances the utility of CNN in genome interpretation.


Subjects
Algorithms , Neural Networks, Computer , Nucleotide Motifs/genetics , Regulatory Sequences, Nucleic Acid/genetics , Databases, Factual
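
The NeuronMotif abstract above reports motifs as position weight matrices. The sketch below shows only the generic last step of any such pipeline: summarising a set of aligned, already demixed sequences as a per-position base-probability matrix. It is not the NeuronMotif algorithm (no sampling or backward clustering is attempted), and the function names, the pseudocount, and the example sequences are all assumptions for illustration.

```python
from collections import Counter


def position_probability_matrix(seqs, alphabet="ACGT", pseudocount=1.0):
    """Summarise equal-length aligned sequences as per-position base probabilities."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + pseudocount * len(alphabet)
    matrix = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        # Pseudocounts keep unseen bases from getting probability zero.
        matrix.append({b: (counts[b] + pseudocount) / total for b in alphabet})
    return matrix


def consensus(matrix):
    """Most probable base at each position."""
    return "".join(max(col, key=col.get) for col in matrix)


# Hypothetical sequences that all activate one (already demixed) pattern.
seqs = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
ppm = position_probability_matrix(seqs)
motif = consensus(ppm)
```

If sequences from two different patterns were pooled before this step, the resulting matrix would blur both, which is the mixing problem the abstract describes.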
5.
Science ; 380(6641): eabn7113, 2023 04 14.
Article in English | MEDLINE | ID: mdl-37053313

ABSTRACT

Postzygotic mutations (PZMs) begin to accrue in the human genome immediately after fertilization, but how and when PZMs affect development and lifetime health remain unclear. To study the origins and functional consequences of PZMs, we generated a multitissue atlas of PZMs spanning 54 tissue and cell types from 948 donors. Nearly half the variation in mutation burden among tissue samples can be explained by measured technical and biological effects, and 9% can be attributed to donor-specific effects. Through phylogenetic reconstruction of PZMs, we found that their type and predicted functional impact vary during prenatal development, across tissues, and through the germ cell life cycle. Thus, methods for interpreting effects across the body and the life span are needed to fully understand the consequences of genetic variants.


Subjects
DNA Mutational Analysis , Longevity , Zygote , Female , Humans , Longevity/genetics , Mutation , Phylogeny , RNA-Seq
6.
Bioinformatics ; 38(6): 1491-1496, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34978563

ABSTRACT

MOTIVATION: Isoform deconvolution is an NP-hard problem. The accuracy of the proposed solutions is far from perfect. At present, it is not known if gene structure and isoform concentration can be uniquely inferred given paired-end reads, and there is no objective method to select the fragment length to improve the number of identifiable genes. Different pieces of evidence suggest that the optimal fragment length is gene-dependent, stressing the need for a method that selects the fragment length according to a reasonable trade-off across all the genes in the whole genome. RESULTS: A gene is considered to be identifiable if it is possible to determine both the structure and concentration of its transcripts unambiguously. Here, we present a method to state the identifiability of this deconvolution problem. Assuming a given transcriptome and that the coverage is sufficient to interrogate all junction reads of the transcripts, this method states whether or not a gene is identifiable given the read length and fragment length distribution. Applying this method using different read and fragment length combinations, the optimal average fragment length for the human transcriptome is around 400-600 nt for coding genes and 150-200 nt for long non-coding RNAs. The optimal read length is the largest one that fits in the fragment length. The potential benefit of combining several libraries to reconstruct the transcriptome is also discussed. Combining two libraries of very different fragment lengths results in a significant improvement in gene identifiability. AVAILABILITY AND IMPLEMENTATION: Code is available on GitHub (https://github.com/JFerrer-B/transcriptome-identifiability). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome , Transcriptome , Humans , RNA-Seq , Gene Library , Protein Isoforms/genetics , Software
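
To make the notion of identifiability in the isoform abstract above more concrete, here is a deliberately simplified sketch. It assumes the isoform structures are already known and asks only whether their concentrations are uniquely determined, which under a linear read-count model holds when the feature-by-isoform indicator matrix has full column rank. This is a simplification introduced here for illustration, not the published method (which also addresses structure); the helper names and the toy gene are hypothetical.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank by Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def concentrations_identifiable(design):
    """True if every isoform column is linearly independent of the others."""
    return matrix_rank(design) == len(design[0])


# Toy gene with two independent cassette exons (E2, E4) and four isoforms:
# columns = [both included, E2 only, E4 only, neither].
exon_only = [
    [1, 1, 1, 1],  # E1
    [1, 1, 0, 0],  # E2
    [1, 1, 1, 1],  # E3
    [1, 0, 1, 0],  # E4
    [1, 1, 1, 1],  # E5
]
# A fragment long enough to link E2 and E4 is seen only in the first isoform.
with_long_fragment = exon_only + [[1, 0, 0, 0]]
```

In this toy case, exon-level evidence alone leaves the four concentrations ambiguous, while a single feature from longer fragments resolves them, which mirrors the abstract's point that fragment length affects identifiability.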
7.
Nat Biotechnol ; 40(5): 703-710, 2022 05.
Article in English | MEDLINE | ID: mdl-35058621

ABSTRACT

Single-cell multiomics data continues to grow at an unprecedented pace. Although several methods have demonstrated promising results in integrating several data modalities from the same tissue, the complexity and scale of data compositions present in cell atlases still pose a challenge. Here, we present scJoint, a transfer learning method to integrate atlas-scale, heterogeneous collections of scRNA-seq and scATAC-seq data. scJoint leverages information from annotated scRNA-seq data in a semisupervised framework and uses a neural network to simultaneously train labeled and unlabeled data, allowing label transfer and joint visualization in an integrative framework. Using atlas data as well as multimodal datasets generated with ASAP-seq and CITE-seq, we demonstrate that scJoint is computationally efficient and consistently achieves substantially higher cell-type label accuracy than existing methods while providing meaningful joint visualizations. Thus, scJoint overcomes the heterogeneity of different data modalities to enable a more comprehensive understanding of cellular phenotypes.


Subjects
Chromatin Immunoprecipitation Sequencing , Single-Cell Analysis , Machine Learning , RNA-Seq , Sequence Analysis, RNA , Single-Cell Analysis/methods , Exome Sequencing
8.
Epigenetics ; 17(2): 220-238, 2022.
Article in English | MEDLINE | ID: mdl-34304711

ABSTRACT

Germline or somatic variation in the family of KMT2 lysine methyltransferases has been associated with a variety of congenital disorders and cancers. Notably, KMT2A-fusions are prevalent in 70% of infant leukaemias but fail to phenocopy short latency leukaemogenesis in mammalian models, suggesting additional factors are necessary for transformation. Given the lack of additional somatic mutation, the role of epigenetic regulation in cell specification, and our prior results of germline KMT2C variation in infant leukaemia patients, we hypothesized that germline dysfunction of KMT2C altered haematopoietic specification. In isogenic KMT2C KO hPSCs, we found genome-wide differences in histone modifications at active and poised enhancers, leading to gene expression profiles akin to mesendoderm rather than mesoderm highlighted by a significant increase in NODAL expression and WNT inhibition, ultimately resulting in a lack of in vitro hemogenic endothelium specification. These unbiased multi-omic results provide new evidence for germline mechanisms increasing risk of early leukaemogenesis.


Subjects
Epigenesis, Genetic , Hemangioblasts , Animals , DNA Methylation , Epigenomics , Humans , Mammals , Mutation
9.
J Clin Invest ; 131(20), 2021 10 15.
Article in English | MEDLINE | ID: mdl-34520398

ABSTRACT

Tumor-infiltrating myeloid cells contribute to the development of the immunosuppressive tumor microenvironment. Myeloid cell expression of arginase 1 (ARG1) promotes a protumor phenotype by inhibiting T cell function and depleting extracellular l-arginine, but the mechanism underlying this expression, especially in breast cancer, is poorly understood. In breast cancer clinical samples and in our mouse models, we identified tumor-derived GM-CSF as the primary regulator of myeloid cell ARG1 expression and local immune suppression through a gene-KO screen of breast tumor cell-produced factors. The induction of myeloid cell ARG1 required GM-CSF and a low pH environment. GM-CSF signaling through STAT3 and p38 MAPK and acid signaling through cAMP were required to activate myeloid cell ARG1 expression in a STAT6-independent manner. Importantly, breast tumor cell-derived GM-CSF promoted tumor progression by inhibiting host antitumor immunity, driving a significant accumulation of ARG1-expressing myeloid cells compared with lung and melanoma tumors with minimal GM-CSF expression. Blockade of tumoral GM-CSF enhanced the efficacy of tumor-specific adoptive T cell therapy and immune checkpoint blockade. Taken together, we show that breast tumor cell-derived GM-CSF contributes to the development of the immunosuppressive breast cancer microenvironment by regulating myeloid cell ARG1 expression and can be targeted to enhance breast cancer immunotherapy.


Subjects
Arginase/physiology , Breast Neoplasms/immunology , Granulocyte-Macrophage Colony-Stimulating Factor/physiology , Immune Tolerance , Myeloid Cells/enzymology , Tumor Microenvironment , Animals , Breast Neoplasms/pathology , Cell Line, Tumor , Cyclic AMP/physiology , Female , Humans , Mice , Mice, Inbred C57BL
10.
Proc Natl Acad Sci U S A ; 118(30), 2021 07 27.
Article in English | MEDLINE | ID: mdl-34285077

ABSTRACT

Dysfunction in T cells limits the efficacy of cancer immunotherapy. We profiled the epigenome, transcriptome, and enhancer connectome of exhaustion-prone GD2-targeting HA-28z chimeric antigen receptor (CAR) T cells and control CD19-targeting CAR T cells, which present less exhaustion-inducing tonic signaling, at multiple points during their ex vivo expansion. We found widespread, dynamic changes in chromatin accessibility and three-dimensional (3D) chromosome conformation preceding changes in gene expression, notably at loci proximal to exhaustion-associated genes such as PDCD1, CTLA4, and HAVCR2, and increased DNA motif access for AP-1 family transcription factors, which are known to promote exhaustion. Although T cell exhaustion has been studied in detail in mice, we find that the regulatory networks of T cell exhaustion differ between species and involve distinct loci of accessible chromatin and cis-regulated target genes in human CAR T cell exhaustion. Deletion of exhaustion-specific candidate enhancers of PDCD1 suppresses the expression of PD-1 in an in vitro model of T cell dysfunction and in HA-28z CAR T cells, suggesting enhancer editing as a path forward in improving cancer immunotherapy.


Subjects
Chromatin/metabolism , Neoplasms/therapy , Programmed Cell Death 1 Receptor/metabolism , Receptors, Chimeric Antigen , T-Lymphocytes/physiology , Animals , Antigens, CD19 , Cell Line , Chromatin/genetics , Gene Expression Regulation, Neoplastic , Humans , Mice , Programmed Cell Death 1 Receptor/genetics
11.
Elife ; 10, 2021 02 25.
Article in English | MEDLINE | ID: mdl-33629655

ABSTRACT

A hallmark of aging is loss of differentiated cell identity. Aged Drosophila midgut differentiated enterocytes (ECs) lose their identity, impairing tissue homeostasis. To discover identity regulators, we performed an RNAi screen targeting ubiquitin-related genes in ECs. Seventeen genes were identified, including the deubiquitinase Non-stop (CG4166). Lineage tracing established that acute loss of Non-stop in young ECs phenocopies aged ECs at cellular and tissue levels. Proteomic analysis unveiled that Non-stop maintains identity as part of a Non-stop identity complex (NIC) containing E(y)2, Sgf11, Cp190, Mod(mdg4), and Nup98. Non-stop ensured chromatin accessibility, maintaining the EC-gene signature, and protected NIC subunit stability. Upon aging, the levels of Non-stop and NIC subunits declined, distorting the unique organization of the EC nucleus. Maintaining youthful levels of Non-stop in wildtype aged ECs safeguarded NIC subunits and nuclear organization and suppressed aging phenotypes. Thus, Non-stop and NIC supervise EC identity and protect from premature aging.


Subjects
Aging, Premature/genetics , Aging/genetics , Drosophila Proteins/genetics , Drosophila melanogaster/physiology , Enterocytes/physiology , Animals , Disease Models, Animal , Drosophila Proteins/metabolism , Female , Male , Phenotype , Proteome
12.
Science ; 367(6485): 1449-1454, 2020 03 27.
Article in English | MEDLINE | ID: mdl-32217721

ABSTRACT

Somatic mutations acquired in healthy tissues as we age are major determinants of cancer risk. Whether variants confer a fitness advantage or rise to detectable frequencies by chance remains largely unknown. Blood sequencing data from ~50,000 individuals reveal how mutation, genetic drift, and fitness shape the genetic diversity of healthy blood (clonal hematopoiesis). We show that positive selection, not drift, is the major force shaping clonal hematopoiesis, provide bounds on the number of hematopoietic stem cells, and quantify the fitness advantages of key pathogenic variants, at single-nucleotide resolution, as well as the distribution of fitness effects (fitness landscape) within commonly mutated driver genes. These data are consistent with clonal hematopoiesis being driven by a continuing risk of mutations and clonal expansions that become increasingly detectable with age.


Subjects
Aging , Biological Evolution , Genetic Drift , Genetic Fitness , Hematopoiesis/genetics , Selection, Genetic , Gene Frequency , Genetics, Population , Hematopoietic Stem Cells/cytology , Humans , Models, Genetic , Mutation , Mutation Rate
14.
Nucleic Acids Res ; 47(8): 3846-3861, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30864654

ABSTRACT

HepG2 is one of the most widely used human cancer cell lines in biomedical research and one of the main cell lines of ENCODE. Although the functional genomic and epigenomic characteristics of HepG2 are extensively studied, its genome sequence has never been comprehensively analyzed and higher order genomic structural features are largely unknown. The high degree of aneuploidy in HepG2 renders traditional genome variant analysis methods challenging and partially ineffective. Correct and complete interpretation of the extensive functional genomics data from HepG2 requires an understanding of the cell line's genome sequence and genome structure. Using a variety of sequencing and analysis methods, we identified a wide spectrum of genome characteristics in HepG2: copy numbers of chromosomal segments at high resolution, SNVs and Indels (corrected for aneuploidy), regions with loss of heterozygosity, phased haplotypes extending to entire chromosome arms, retrotransposon insertions and structural variants (SVs) including complex and somatic genomic rearrangements. A large number of SVs were phased, sequence assembled and experimentally validated. We re-analyzed published HepG2 datasets for allele-specific expression and DNA methylation and assembled an allele-specific CRISPR/Cas9 targeting map. We demonstrate how deeper insights into genomic regulatory complexity are gained by adopting a genome-integrated framework.


Subjects
Chromosome Mapping/methods , Genome, Human , Genomics/methods , Haplotypes , Sequence Analysis, DNA/statistics & numerical data , Alleles , Aneuploidy , DNA Methylation , Genomic Structural Variation , Hep G2 Cells , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Karyotyping , Loss of Heterozygosity , Polymorphism, Single Nucleotide , Retroelements
15.
Genome Res ; 29(3): 472-484, 2019 03.
Article in English | MEDLINE | ID: mdl-30737237

ABSTRACT

K562 is widely used in biomedical research. It is one of three tier-one cell lines of ENCODE and is also the cell line most commonly used for large-scale CRISPR/Cas9 screens. Although its functional genomic and epigenomic characteristics have been extensively studied, its genome sequence and genomic structural features have never been comprehensively analyzed. Such information is essential for the correct interpretation and understanding of the vast troves of existing functional genomics and epigenomics data for K562. We performed and integrated deep-coverage whole-genome (short-insert), mate-pair, and linked-read sequencing as well as karyotyping and array CGH analysis to identify a wide spectrum of genome characteristics in K562: copy numbers (CN) of aneuploid chromosome segments at high resolution, SNVs and indels (both corrected for CN in aneuploid regions), loss of heterozygosity, megabase-scale phased haplotypes often spanning entire chromosome arms, structural variants (SVs), including small and large-scale complex SVs and nonreference retrotransposon insertions. Many SVs were phased, assembled, and experimentally validated. We identified multiple allele-specific deletions and duplications within the tumor suppressor gene FHIT. Taking aneuploidy into account, we reanalyzed K562 RNA-seq and whole-genome bisulfite sequencing data for allele-specific expression and allele-specific DNA methylation. We also show examples of how deeper insights into regulatory complexity are gained by integrating genomic variant information and structural context with functional genomics and epigenomics data. Furthermore, using K562 haplotype information, we produced an allele-specific CRISPR targeting map. This comprehensive whole-genome analysis serves as a resource for future studies that utilize K562 as well as a framework for the analysis of other cancer genomes.


Subjects
Genome, Human , Humans , K562 Cells , Karyotype , Polymorphism, Genetic , Whole Genome Sequencing
16.
Genet Med ; 21(9): 2126-2134, 2019 09.
Article in English | MEDLINE | ID: mdl-30675030

ABSTRACT

PURPOSE: Despite the successful progress next-generation sequencing technologies have achieved in diagnosing the genetic cause of rare Mendelian diseases, the current diagnostic rate is still far from satisfactory because of heterogeneity, imprecision, and noise in disease phenotype descriptions and insufficient utilization of expert knowledge in clinical genetics. To overcome these difficulties, we present a novel method called Xrare for the prioritization of causative gene variants in rare disease diagnosis. METHODS: We propose a new phenotype similarity scoring method called Emission-Reception Information Content (ERIC), which is highly tolerant of noise and imprecision in clinical phenotypes. We utilize medical genetic domain knowledge by designing genetic features implementing American College of Medical Genetics and Genomics (ACMG) guidelines. RESULTS: ERIC score ranked consistently higher for disease genes than other phenotypic similarity scores in the presence of imprecise and noisy phenotypes. Extensive simulations and real clinical data demonstrated that Xrare outperforms existing alternative methods by 10-40% in various genetic diagnosis scenarios. CONCLUSION: The Xrare model is learned from a large database of clinical variants, and derives its strength from the tight integration of medical genetics features and phenotypic similarity scores. Xrare provides the clinical community with a robust and powerful tool for variant prioritization.


Subjects
Genomics/methods , Machine Learning , Rare Diseases/diagnosis , Software , Computational Biology , Exome/genetics , Genetic Testing , Genetic Variation/genetics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Mutation , Phenotype , Rare Diseases/genetics
17.
Cell Stem Cell ; 24(2): 271-284.e8, 2019 02 07.
Article in English | MEDLINE | ID: mdl-30686763

ABSTRACT

Tissue development results from lineage-specific transcription factors (TFs) programming a dynamic chromatin landscape through progressive cell fate transitions. Here, we define the epigenomic landscape during epidermal differentiation of human pluripotent stem cells (PSCs) and create inference networks that integrate gene expression, chromatin accessibility, and TF binding to define regulatory mechanisms during keratinocyte specification. We found two critical chromatin networks during surface ectoderm initiation and keratinocyte maturation, which are driven by TFAP2C and p63, respectively. Consistently, TFAP2C, but not p63, is sufficient to initiate surface ectoderm differentiation, and TFAP2C-initiated progenitor cells are capable of maturing into functional keratinocytes. Mechanistically, TFAP2C primes the surface ectoderm chromatin landscape and induces p63 expression and binding sites, thus allowing maturation factor p63 to positively autoregulate its own expression and close a subset of the TFAP2C-initiated surface ectoderm program. Our work provides a general framework to infer TF networks controlling chromatin transitions that will facilitate future regenerative medicine advances.


Subjects
Cell Lineage , Chromatin/metabolism , Epidermis/metabolism , Gene Regulatory Networks , Transcription Factor AP-2/metabolism , Transcription Factors/metabolism , Tumor Suppressor Proteins/metabolism , Cell Differentiation , Ectoderm/cytology , Epigenesis, Genetic , Feedback, Physiological , Humans , Keratinocytes/cytology , Transcriptome/genetics
18.
Sci Data ; 5: 180261, 2018 12 18.
Article in English | MEDLINE | ID: mdl-30561434

ABSTRACT

We produced an extensive collection of deep re-sequencing datasets for the Venter/HuRef genome using the Illumina massively-parallel DNA sequencing platform. The original Venter genome sequence is a very high-quality phased assembly based on Sanger sequencing. Therefore, researchers developing novel computational tools for the analysis of human genome sequence variation for the dominant Illumina sequencing technology can test and hone their algorithms by making variant calls from these Venter/HuRef datasets and then immediately confirm the detected variants in the Sanger assembly, freeing them of the need for further experimental validation. This process also applies to implementing and benchmarking existing genome analysis pipelines. We prepared and sequenced 200 bp and 350 bp short-insert whole-genome sequencing libraries (sequenced to 100x and 40x genomic coverages respectively) as well as 2 kb, 5 kb, and 12 kb mate-pair libraries (49x, 122x, and 145x physical coverages respectively). Lastly, we produced a linked-read library (128x physical coverage) from which we also performed haplotype phasing.


Subjects
Benchmarking/methods , Genome, Human , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA/standards , Algorithms , Gene Library , Genetic Variation , Humans
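
The Venter/HuRef abstract above distinguishes genomic (sequence) coverage from physical coverage. The sketch below spells out the usual back-of-the-envelope definitions: sequence coverage counts sequenced bases, while physical coverage counts the genome spanned by each insert between a read pair. The function names and the genome size of 3.1 Gb are assumptions for illustration; the abstract does not state how many reads or pairs were produced.

```python
def sequence_coverage(n_reads, read_length, genome_size):
    """Average number of sequenced bases covering each genome position."""
    return n_reads * read_length / genome_size


def physical_coverage(n_pairs, insert_size, genome_size):
    """Average number of inserts (read pair plus unsequenced gap) spanning each position."""
    return n_pairs * insert_size / genome_size


def pairs_needed(target_physical_coverage, insert_size, genome_size):
    """Read pairs required to reach a target physical coverage."""
    return target_physical_coverage * genome_size / insert_size


GENOME_SIZE = 3_100_000_000  # assumed approximate human genome size in bp

# Roughly how many 12 kb mate pairs would give 145x physical coverage.
pairs_12kb = pairs_needed(145, 12_000, GENOME_SIZE)
```

Because mate-pair inserts are far longer than the reads themselves, a library can reach high physical coverage with modest sequence coverage, which is why the two figures are reported separately.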
19.
J Vis Exp ; (138), 2018 08 03.
Article in English | MEDLINE | ID: mdl-30124656

ABSTRACT

Conventional next-generation sequencing techniques (NGS) have allowed for immense genomic characterization for over a decade. Specifically, NGS has been used to analyze the spectrum of clonal mutations in malignancy. Though far more efficient than traditional Sanger methods, NGS struggles with identifying rare clonal and subclonal mutations due to its high error rate of ~0.5-2.0%. Thus, standard NGS has a limit of detection for mutations that are >0.02 variant allele fraction (VAF). While the clinical significance for mutations this rare in patients without known disease remains unclear, patients treated for leukemia have significantly improved outcomes when residual disease is <0.0001 by flow cytometry. In order to mitigate this artefactual background of NGS, numerous methods have been developed. Here we describe a method for Error-corrected DNA and RNA Sequencing (ECS), which involves tagging individual molecules with both a 16 bp random index for error-correction and an 8 bp patient-specific index for multiplexing. Our method can detect and track clonal mutations at variant allele fractions (VAFs) two orders of magnitude lower than the detection limit of NGS and as rare as 0.0001 VAF.


Subjects
Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Humans
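
The error-corrected sequencing abstract above describes tagging each molecule with a random index so that sequencing errors can be separated from real variants. A minimal sketch of that general idea follows: reads sharing the same random index are grouped into a family, and a base is called only where the family agrees. This is a generic illustration, not the published ECS pipeline; the function name, the thresholds `min_family_size` and `min_agreement`, and the toy reads are assumptions.

```python
from collections import Counter, defaultdict


def consensus_by_index(reads, min_family_size=3, min_agreement=0.7):
    """Collapse reads that share a random molecular index into one consensus.

    reads: iterable of (random_index, sequence) pairs, sequences pre-aligned
    and of equal length within a family. Positions without sufficient
    agreement are reported as "N".
    """
    families = defaultdict(list)
    for index, seq in reads:
        families[index].append(seq)

    consensus = {}
    for index, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few copies to tell an error from a real base
        if len({len(s) for s in seqs}) != 1:
            continue  # this sketch skips families with unequal read lengths
        called = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            called.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[index] = "".join(called)
    return consensus


# Hypothetical reads: 16 bp random index, then a 4 bp target sequence.
reads = [
    ("AAAACCCCGGGGTTTT", "ACGT"), ("AAAACCCCGGGGTTTT", "ACGT"),
    ("AAAACCCCGGGGTTTT", "ACGT"), ("AAAACCCCGGGGTTTT", "ACTT"),  # one error
    ("TTTTGGGGCCCCAAAA", "ACTT"),                                 # singleton
    ("ACGTACGTACGTACGT", "ACTT"), ("ACGTACGTACGTACGT", "ACTT"),
    ("ACGTACGTACGTACGT", "ACTT"),                                 # real variant
]
result = consensus_by_index(reads)
```

In the toy data, the lone `ACTT` read in the first family is outvoted and treated as an error, whereas the third family consistently supports `ACTT` and is kept as a candidate variant.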
20.
J Bone Miner Res ; 31(3): 524-34, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26363184

ABSTRACT

Regulation of gene expression changes during chondrogenic differentiation by DNA methylation and demethylation is little understood. Methylated cytosines (5mC) are oxidized by the ten-eleven-translocation (TET) proteins to 5-hydroxymethylcytosines (5hmC), 5-formylcytosines (5fC), and 5-carboxylcytosines (5caC), eventually leading to a replacement by unmethylated cytosines (C), i.e., DNA demethylation. Additionally, 5hmC is stable and acts as an epigenetic mark by itself. Here, we report that global changes in 5hmC mark chondrogenic differentiation in vivo and in vitro. Tibia anlagen and growth plate analyses during limb development at mouse embryonic days E11.5, 13.5, and 17.5 showed dynamic changes in 5hmC levels in the differentiating chondrocytes. A similar increase in 5hmC levels was observed in the ATDC5 chondroprogenitor cell line accompanied by increased expression of the TET proteins during in vitro differentiation. Loss of TET1 in ATDC5 decreased 5hmC levels and impaired differentiation, demonstrating a functional role for TET1-mediated 5hmC dynamics in chondrogenic differentiation. Global analyses of the 5hmC-enriched sequences during early and late chondrogenic differentiation identified 5hmC distribution to be enriched in the regulatory regions of genes preceding the transcription start site (TSS), as well as in the gene bodies. Stable gains in 5hmC were observed in specific subsets of genes, including genes associated with cartilage development and in chondrogenic lineage-specific genes. 5hmC gains in regulatory promoter and enhancer regions as well as in gene bodies were strongly associated with activated but not repressed genes, indicating a potential regulatory role for DNA hydroxymethylation in chondrogenic gene expression.


Subjects
Cell Differentiation/genetics , Chondrogenesis/genetics , Cytosine/analogs & derivatives , Transcriptional Activation/genetics , 5-Methylcytosine/analogs & derivatives , Animals , Cartilage/embryology , Chondrocytes/cytology , Chondrocytes/metabolism , Cytosine/metabolism , DNA, Intergenic/genetics , DNA-Binding Proteins/metabolism , Embryonic Development/genetics , Extremities/embryology , Gene Expression Regulation, Developmental , Mice , Proto-Oncogene Proteins/metabolism , Stem Cells/cytology